154 research outputs found
A Mean Field Approach for Optimization in Particles Systems and Applications
This paper investigates the limit behavior of Markov Decision Processes
(MDPs) made of independent particles evolving in a common environment, when the
number of particles goes to infinity. In the finite horizon case or with a
discounted cost and an infinite horizon, we show that when the number of
particles becomes large, the optimal cost of the system converges almost surely
to the optimal cost of a discrete deterministic system (the ``optimal mean
field''). Convergence also holds for optimal policies. We further provide
insights on the speed of convergence by proving several central limit theorems
for the cost and the state of the Markov decision process with explicit
formulas for the variance of the limit Gaussian laws. Then, our framework is
applied to a brokering problem in grid computing. The optimal policy for the
limit deterministic system is computed explicitly. Several simulations with
growing numbers of processors are reported. They compare the performance of the
optimal policy of the limit system used in the finite case with classical
policies (such as Join the Shortest Queue) by measuring its asymptotic gain as
well as the threshold above which it starts outperforming classical policies.
Incentives and Redistribution in Homogeneous Bike-Sharing Systems with Stations of Finite Capacity
Bike-sharing systems are becoming important for urban transportation. In such
systems, users arrive at a station, take a bike and use it for a while, then
return it to another station of their choice. Each station has a finite
capacity: it cannot host more bikes than its capacity. We propose a stochastic
model of a homogeneous bike-sharing system and study the effect of users'
random choices on the number of problematic stations, i.e., stations that, at a
given time, have no bikes available or no available spots for bikes to be
returned to. We quantify the influence of the station capacities, and we
compute the fleet size that is optimal in terms of minimizing the proportion of
problematic stations. Even in a homogeneous city, the system exhibits a poor
performance: the minimal proportion of problematic stations is of the order of
(but not lower than) the inverse of the capacity. We show that simple
incentives, such as suggesting that users return bikes to the less loaded of
two stations, improve the situation by an exponential factor. We also
compute the rate at which bikes have to be redistributed by trucks to ensure a
given quality of service. This rate is of the order of the inverse of the
station capacity. For all cases considered, the fleet size that corresponds to
the best performance is half of the total number of spots plus a few more; the
number of extra bikes can be computed in closed form as a function of the
system parameters. It corresponds to the average number of bikes in
circulation.
Distributing Labels on Infinite Trees
Sturmian words are infinite binary words with many equivalent definitions:
they have minimal factor complexity among all aperiodic sequences, they are
balanced sequences (the labels 0 and 1 are as evenly distributed as possible)
and they can be constructed using a mechanical definition. All these properties
make them good candidates for being extremal points in scheduling problems over
two processors. In this paper, we consider the problem of generalizing Sturmian
words to trees. The problem is to evenly distribute labels 0 and 1 over
infinite trees. We show that (strongly) balanced trees exist and can also be
constructed using a mechanical process as long as the tree is irrational. Such
trees also have a minimal factor complexity. Therefore they bring the hope that
extremal scheduling properties of Sturmian words can be extended to such trees,
at least partially. Such possible extensions are illustrated by one such
example.
Comment: 30 pages, use pgf/tik
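The mechanical definition and the balance property mentioned in this abstract can be illustrated on ordinary (one-dimensional) Sturmian words. The sketch below is not taken from the paper and does not cover the tree case: it uses the standard formula s_n = floor((n+1)α + ρ) - floor(nα + ρ) for a lower mechanical word, and the function names, the slope α and the prefix length are assumptions chosen for illustration.

```python
from math import floor, sqrt


def mechanical_word(alpha, rho, length):
    """First `length` letters of the lower mechanical word of slope alpha, intercept rho."""
    return [floor((n + 1) * alpha + rho) - floor(n * alpha + rho)
            for n in range(length)]


def is_balanced(word):
    """True if any two factors of equal length differ by at most one in their number of 1s."""
    prefix = [0]
    for letter in word:
        prefix.append(prefix[-1] + letter)
    n = len(word)
    for k in range(1, n + 1):
        counts = [prefix[i + k] - prefix[i] for i in range(n - k + 1)]
        if max(counts) - min(counts) > 1:
            return False
    return True


def factor_complexity(word, k):
    """Number of distinct factors of length k occurring in the finite word."""
    return len({tuple(word[i:i + k]) for i in range(len(word) - k + 1)})


if __name__ == "__main__":
    alpha = (sqrt(5) - 1) / 2  # irrational slope, an illustrative choice
    word = mechanical_word(alpha, 0.0, 200)
    print("".join(map(str, word[:30])))
    print("balanced:", is_balanced(word))
    print("factors of length 1..5:", [factor_complexity(word, k) for k in range(1, 6)])
```

For an irrational slope, the finite prefix should pass the balance check and show k + 1 distinct factors for small k, which is the minimal complexity of an aperiodic word. Floating-point arithmetic is adequate here only because the prefix is short.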
Construction of Lyapunov functions via relative entropy with application to caching
We consider a system of interacting objects that generalizes the model of the RAND(m) cache-replacement policy introduced in [6]. We provide a mean-field approximation of this system. We show how to use relative entropy to construct a Lyapunov function for this model. This guarantees that the mean-field model converges to its unique fixed point.
The Power of Two Choices on Graphs: the Pair-Approximation is Accurate
The power of two-choice is a well-known paradigm for improving load balancing: each incoming task is allocated to the least loaded of two servers picked at random among a collection of n servers [6, 4]. We study the power of two-choice in a setting where the two servers are not picked independently at random but are connected by an edge in an underlying graph. Our problem is motivated by systems in which choices are geometrically constrained (see the model of bike-sharing systems introduced in [1, Section 4]). We study a dynamic setting in which jobs leave the system after being served by the server to which they were allocated. Our focus is on the case where each server has few neighbors (typically 2 to 4), for which a mean-field approximation is not accurate. The static counterpart of our model is studied in [2], where it is shown, by counting the number of arrivals on an edge, that the power of two-choice does not hold when the degree is small. This technique cannot be used to study the dynamic setting, as the departures induce long-range dependence. The process is N-dimensional and has no product-form stationary distribution. An exact analytic solution seems out of reach. We use pair-approximation, a technique widespread in biology [5]. We build the equations and show that they accurately describe the steady state of the system. Our results show that, even in a graph of degree 2, choosing between two neighboring servers dramatically improves performance compared to a random allocation.
1. GEOMETRIC TWO-CHOICE MODEL
Our system is composed of n identical servers that are connected by an undirected graph (V, E), where the set of vertices is the set of servers V = {1, ..., n}. Each server serves jobs at rate µ and uses a first-come first-served discipline. Jobs arrive in the system at rate nλ. For each incoming job, one server, say s1, is picked uniformly at random among the n servers. Then, another server s2 is picked uniformly at random among the neighbors of s1. The job is then allocated to whichever of s1 and s2 has fewer jobs (ties are broken at random). This allocation scheme is similar to the one of [2]. We denote the load by ρ = λ/µ and assume that ρ < 1. We now describe a few examples that we will explore numerically in Section 3
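The allocation rule described in this excerpt can be tried out with a small simulation. The sketch below is not the paper's pair-approximation: it is a plain Monte Carlo simulation that assumes a ring graph (degree 2) and uses uniformization, where every step is an arrival with probability λ/(λ+µ) and otherwise a potential departure at a uniformly chosen server. The function name `simulate_ring` and all parameter values are hypothetical.

```python
import random


def simulate_ring(n, lam, mu, steps, two_choice, seed=0):
    """Estimate the time-averaged number of jobs per server on a ring of n servers.

    Assumptions (not from the source): the graph is a ring, and the
    continuous-time chain is uniformized at total rate n * (lam + mu).
    """
    rng = random.Random(seed)
    queue = [0] * n
    total_jobs = 0
    area = 0  # sum over steps of the total number of jobs in the system
    p_arrival = lam / (lam + mu)
    for _ in range(steps):
        if rng.random() < p_arrival:
            # Arrival: pick s1 uniformly, then s2 uniformly among its neighbors.
            s1 = rng.randrange(n)
            target = s1
            if two_choice:
                s2 = (s1 + rng.choice((-1, 1))) % n
                if queue[s2] < queue[s1]:
                    target = s2
                elif queue[s2] == queue[s1] and rng.random() < 0.5:
                    target = s2  # ties are broken at random
            queue[target] += 1
            total_jobs += 1
        else:
            # Potential departure: a uniformly chosen server completes a job if busy.
            s = rng.randrange(n)
            if queue[s] > 0:
                queue[s] -= 1
                total_jobs -= 1
        area += total_jobs
    return area / (steps * n)


if __name__ == "__main__":
    for flag in (False, True):
        mean_jobs = simulate_ring(n=50, lam=0.7, mu=1.0, steps=200_000,
                                  two_choice=flag, seed=1)
        print("two-choice" if flag else "random", round(mean_jobs, 3))
```

With `two_choice=False` each server behaves as an independent M/M/1 queue, so the estimate can be compared with ρ/(1-ρ). The run starts from an empty system, so short runs underestimate the steady-state value.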
Linear Regression from Strategic Data Sources
Linear regression is a fundamental building block of statistical data
analysis. It amounts to estimating the parameters of a linear model that maps
input features to corresponding outputs. In the classical setting where the
precision of each data point is fixed, the famous Aitken/Gauss-Markov theorem
in statistics states that generalized least squares (GLS) is a so-called "Best
Linear Unbiased Estimator" (BLUE). In modern data science, however, one often
faces strategic data sources, namely, individuals who incur a cost for
providing high-precision data.
In this paper, we study a setting in which features are public but
individuals choose the precision of the outputs they reveal to an analyst. We
assume that the analyst performs linear regression on this dataset, and
individuals benefit from the outcome of this estimation. We model this scenario
as a game where individuals minimize a cost comprising two components: (a) an
(agent-specific) disclosure cost for providing high-precision data; and (b) a
(global) estimation cost representing the inaccuracy in the linear model
estimate. In this game, the linear model estimate is a public good that
benefits all individuals. We establish that this game has a unique non-trivial
Nash equilibrium. We study the efficiency of this equilibrium and we prove
tight bounds on the price of stability for a large class of disclosure and
estimation costs. Finally, we study the estimator accuracy achieved at
equilibrium. We show that, in general, Aitken's theorem does not hold under
strategic data sources, though it does hold if individuals have identical
disclosure costs (up to a multiplicative factor). When individuals have
non-identical costs, we derive a bound on the improvement of the equilibrium
estimation cost that can be achieved by deviating from GLS, under mild
assumptions on the disclosure cost functions.
Comment: This version (v3) extends the results on the sub-optimality of GLS
(Section 6) and improves writing in multiple places compared to v2. Compared
to the initial version v1, it also fixes an error in Theorem 6 (now Theorem
5), and extended many of the results
Incentives and redistribution in homogeneous bike-sharing systems with stations of finite capacity
Bike-sharing systems are becoming important for urban transportation. In these systems, users arrive at a station, pick up a bike, use it for a while, and then return it to another station of their choice. Each station has a finite capacity: it cannot host more bikes than its capacity. We propose a stochastic model of a homogeneous bike-sharing system and study the effect of the randomness of user choices on the number of problematic stations, i.e., stations that, at a given time, have no bikes available or no available spots for bikes to be returned to. We quantify the influence of the station capacities, and we compute the fleet size that is optimal in terms of minimizing the proportion of problematic stations. Even in a homogeneous city, the system exhibits a poor performance: the minimal proportion of problematic stations is of the order of the inverse of the capacity. We show that simple incentives, such as suggesting that users return bikes to the less loaded of two stations, improve the situation by an exponential factor. We also compute the rate at which bikes have to be redistributed by trucks for a given quality of service. This rate is of the order of the inverse of the station capacity. For all cases considered, the fleet size that corresponds to the best performance is half of the total number of spots plus a few more; the number of extra bikes can be computed in closed form as a function of the system parameters. It corresponds to the average number of bikes in circulation.
Bias and Refinement of Multiscale Mean Field Models
Mean field approximation is a powerful technique which has been used in many
settings to study large-scale stochastic systems. In the case of two-timescale
systems, the approximation is obtained by a combination of scaling arguments
and the use of the averaging principle. This paper analyzes the approximation
error of this `average' mean field model for a two-timescale model
$(\boldsymbol{X}, \boldsymbol{Y})$, where the slow component $\boldsymbol{X}$
describes a population of interacting particles which is fully coupled with a
rapidly changing environment $\boldsymbol{Y}$. The model is parametrized by a
scaling factor $N$, e.g. the population size: as $N$ gets large, the jump size
of the slow component decreases, in contrast to the unchanged dynamics of
the fast component. We show that under relatively mild conditions the `average'
mean field approximation has a bias of order $O(1/N)$ compared to
$\mathbb{E}[\boldsymbol{X}]$. This holds true under any continuous performance
metric in the transient regime as well as for the steady-state if the model is
exponentially stable. To go one step further, we derive a bias correction term
for the steady-state from which we define a new approximation called the
refined `average' mean field approximation whose bias is of order $O(1/N^2)$.
This refined `average' mean field approximation allows computing an accurate
approximation even for small scaling factors, i.e., $N \approx 10$-$50$. We
illustrate the developed framework and accuracy results through an application
to a random access CSMA model.
Comment: 28 pages